[SQL][minor] use catalyst type converter in ScalaUdf by cloud-fan · Pull Request #6182 · apache/spark

cloud-fan · 2015-05-15T10:11:05Z

This is a follow-up to #5154: we can speed up Scala UDF evaluation by creating the type converters in advance.
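
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what "creating the converter in advance" means. It does not use Spark's classes: `DataType`, `IntType`, `StringType`, `createConverter`, `PerRowUdf`, and `HoistedUdf` are hypothetical stand-ins invented for illustration, and the actual change in this PR may be structured differently. The sketch only shows the general technique of building a conversion function once, when the expression is constructed, rather than on every row.

```scala
object ConverterHoisting {
  // Hypothetical stand-ins for Catalyst data types; these are not Spark's classes.
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType

  // Counts how many times a converter is built, so both strategies can be compared.
  var converterBuilds = 0

  // Picks a conversion function for the given type (assumed helper, for illustration only).
  def createConverter(dt: DataType): Any => Any = {
    converterBuilds += 1
    dt match {
      case IntType    => (v: Any) => v
      case StringType => (v: Any) => if (v == null) null else v.toString
    }
  }

  // Per-row strategy: the converter is built again on every call to eval.
  class PerRowUdf(f: Int => Int, argType: DataType) {
    def eval(input: Any): Any =
      if (input == null) null
      else f(createConverter(argType)(input).asInstanceOf[Int])
  }

  // Hoisted strategy: the converter is built once, when the expression is constructed.
  class HoistedUdf(f: Int => Int, argType: DataType) {
    private[this] val converter = createConverter(argType)

    def eval(input: Any): Any =
      if (input == null) null
      else f(converter(input).asInstanceOf[Int])
  }

  def main(args: Array[String]): Unit = {
    val floor = (ts: Int) => ts - ts % 300

    val perRow = new PerRowUdf(floor, IntType)
    (1 to 1000).foreach(i => perRow.eval(i))
    val perRowBuilds = converterBuilds

    converterBuilds = 0
    val hoisted = new HoistedUdf(floor, IntType)
    (1 to 1000).foreach(i => hoisted.eval(i))

    println(s"per-row builds: $perRowBuilds, hoisted builds: $converterBuilds")
  }
}
```

Both classes return the same results; the only difference is how often the converter lookup runs, which is the cost this kind of change is meant to remove from the per-row path.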

AmplabJenkins · 2015-05-15T10:12:10Z

Merged build triggered.

AmplabJenkins · 2015-05-15T10:12:16Z

Merged build started.

cloud-fan · 2015-05-15T10:13:36Z

Use the same benchmark:

import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types._

// Hand-written expression used as the baseline: rounds an Int down to a multiple of 300.
case class Floor(child: Expression) extends UnaryExpression with Predicate {
  override def foldable = child.foldable
  def nullable = child.nullable
  override def toString = s"Floor $child"

  override def eval(input: Row): Any = {
    child.eval(input) match {
      case null => null
      case ts: Int => ts - ts % 300
    }
  }
}

object T {
  // Evaluates `expr` against a fixed row `count` times and prints the elapsed wall-clock time.
  def benchmark(count: Int, expr: Expression): Unit = {
    var i = 0
    val row = new GenericRow(Array[Any](123, 21, 42))
    val s = System.currentTimeMillis()
    while (i < count) {
      expr.eval(row)
      i += 1
    }
    val e = System.currentTimeMillis()

    println(s"${expr.getClass.getSimpleName} -- ${e - s} ms")
  }

  def main(args: Array[String]): Unit = {
    // The same computation, once wrapped in a ScalaUdf and once as the hand-written expression.
    def func(ts: Int) = ts - ts % 300
    val udf0 = ScalaUdf(func _, IntegerType, BoundReference(0, IntegerType, true) :: Nil)
    val udf1 = Floor(BoundReference(0, IntegerType, true))

    // Run each expression three times; the first run is usually the slowest.
    benchmark(1000000, udf0)
    benchmark(1000000, udf0)
    benchmark(1000000, udf0)

    benchmark(1000000, udf1)
    benchmark(1000000, udf1)
    benchmark(1000000, udf1)
  }
}

before:
ScalaUdf -- 151 ms
ScalaUdf -- 127 ms
ScalaUdf -- 128 ms
Floor -- 23 ms
Floor -- 4 ms
Floor -- 5 ms

after:
ScalaUdf -- 28 ms
ScalaUdf -- 12 ms
ScalaUdf -- 8 ms
Floor -- 22 ms
Floor -- 4 ms
Floor -- 4 ms

SparkQA · 2015-05-15T10:14:16Z

Test build #32813 has started for PR 6182 at commit 241cfe9.

SparkQA · 2015-05-15T12:08:04Z

Test build #32813 has finished for PR 6182 at commit 241cfe9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-05-15T12:08:08Z

Merged build finished. Test PASSed.

AmplabJenkins · 2015-05-15T12:08:09Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32813/

JoshRosen · 2015-05-17T23:34:50Z

LGTM; seems like a pretty straightforward optimization. I'm a bit new to Spark SQL, though, so I'll wait for a committer with SQL experience to do the final sign-off on this.

yhuai · 2015-05-17T23:50:29Z

LGTM

yhuai · 2015-05-17T23:51:26Z

I am merging it to master and branch 1.4.

It's a follow-up of #5154, we can speed up scala udf evaluation by create type converter in advance.

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #6182 from cloud-fan/tmp and squashes the following commits:

241cfe9 [Wenchen Fan] use converter in ScalaUdf

(cherry picked from commit 2f22424)
Signed-off-by: Yin Huai <yhuai@databricks.com>

It's a follow-up of apache#5154, we can speed up scala udf evaluation by create type converter in advance.

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes apache#6182 from cloud-fan/tmp and squashes the following commits:

241cfe9 [Wenchen Fan] use converter in ScalaUdf

use converter in ScalaUdf

241cfe9

asfgit closed this in 2f22424 May 17, 2015

cloud-fan deleted the tmp branch May 18, 2015 02:05

cloud-fan wants to merge 1 commit into apache:master from cloud-fan:tmp

5 participants
